NSF PAR Search | NSF Public Access Repository

The Test of Tests: A Framework for Differentially Private Hypothesis Testing

Kazan, Zeki; Shi, Kaiyan; Groce, Adam; Bray, Andrew (July 2023, Proceedings of the 40th International Conference on Machine Learning)

We present a generic framework for creating differentially private versions of any hypothesis test in a black-box way. We analyze the resulting tests analytically and experimentally. Most crucially, we demonstrate good practical performance on small data sets: at ϵ = 1 we need only 5-6 times as much data as in the fully public setting. We compare our work to the one existing framework of this type, as well as to several individually designed private hypothesis tests. Our framework has higher power than other generic solutions and is at least competitive with (and often better than) individually designed tests.
Climate warming amplifies the frequency of fish mass mortality events across north temperate lakes

https://doi.org/10.1002/lol2.10274

Tye, Simon P.; Siepielski, Adam M.; Bray, Andrew; Rypel, Andrew L.; Phelps, Nicholas B.; Fey, Samuel B. (December 2022, Limnology and Oceanography Letters)

Full Text Available
Global multi-model projections of local urban climates

https://doi.org/10.1038/s41558-020-00958-8

Zhao, Lei; Oleson, Keith; Bou-Zeid, Elie; Krayenhoff, E. Scott; Bray, Andrew; Zhu, Qing; Zheng, Zhonghua; Chen, Chen; Oppenheimer, Michael (February 2021, Nature Climate Change)
Full Text Available
Differentially Private Nonparametric Hypothesis Testing

https://doi.org/10.1145/3319535.3339821

Couch, Simon; Kazan, Zeki; Shi, Kaiyan; Bray, Andrew; Groce, Adam (November 2019, Proceedings of the 2019 ACM SIGSAC Conference on Computer and Communications Security (CCS))

Hypothesis tests are a crucial statistical tool for data mining and are the workhorse of scientific research in many fields. Here we study differentially private tests of independence between a categorical and a continuous variable. We take as our starting point traditional nonparametric tests, which require no distributional assumptions (e.g., normality) about the data. We present private analogues of the Kruskal-Wallis, Mann-Whitney, and Wilcoxon signed-rank tests, as well as the parametric one-sample t-test. These tests use novel test statistics developed specifically for the private setting. We compare our tests to prior work on both parametric and nonparametric tests. We find that in all cases our new nonparametric tests achieve large improvements in statistical power, even when the assumptions of parametric tests are met.
Full Text Available
Evaluating a primary carbonate pathway for manganese enrichments in reducing environments

https://doi.org/10.1016/j.epsl.2020.116201

Wittkop, Chad; Swanner, Elizabeth D.; Grengs, Ashley; Lambrecht, Nicholas; Fakhraee, Mojtaba; Myrbo, Amy; Bray, Andrew W.; Poulton, Simon W.; Katsev, Sergei (May 2020, Earth and Planetary Science Letters)

Full Text Available
Improved Differentially Private Analysis of Variance

https://doi.org/10.2478/popets-2019-0049

Swanberg, Marika; Globus-Harris, Ira; Griffith, Iris; Ritz, Anna; Groce, Adam; Bray, Andrew (July 2019, Proceedings on Privacy Enhancing Technologies)

Hypothesis testing is one of the most common types of data analysis and forms the backbone of scientific research in many disciplines. Analysis of variance (ANOVA) in particular is used to detect dependence between a categorical and a numerical variable. Here we show how one can carry out this hypothesis test under the restrictions of differential privacy. We show that the F-statistic, the optimal test statistic in the public setting, is no longer optimal in the private setting, and we develop a new test statistic F1 with much higher statistical power. We show how to rigorously compute a reference distribution for the F1 statistic and give an algorithm that outputs accurate p-values. We implement our test and experimentally optimize several parameters. We then compare our test to the only previous work on private ANOVA testing, using the same effect size as that work. We see an order of magnitude improvement, with our test requiring only 7% as much data to detect the effect.
Full Text Available
